AITopics | action abstraction

Collaborating Authors

action abstraction

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Learning to Reason as Action Abstractions with Scalable Mid-Training RL

Zhang, Shenao, Yu, Donghan, Feng, Yihao, Jin, Bowen, Wang, Zhaoran, Peebles, John, Wang, Zirui

arXiv.org Machine LearningOct-7-2025

Large language models excel with reinforcement learning (RL), but fully unlocking this potential requires a mid-training stage. An effective mid-training phase should identify a compact set of useful actions and enable fast selection among them through online RL. We formalize this intuition by presenting the first theoretical result on how mid-training shapes post-training: it characterizes an action subspace that minimizes both the value approximation error from pruning and the RL error during subsequent planning. Our analysis reveals two key determinants of mid-training effectiveness: pruning efficiency, which shapes the prior of the initial RL policy, and its impact on RL convergence, which governs the extent to which that policy can be improved via online interactions. These results suggest that mid-training is most effective when the decision space is compact and the effective horizon is short, highlighting the importance of operating in the space of action abstractions rather than primitive actions. Building on these insights, we propose Reasoning as Action Abstractions (RA3), a scalable mid-training algorithm. Specifically, we derive a sequential variational lower bound and optimize it by iteratively discovering temporally-consistent latent structures via RL, followed by fine-tuning on the bootstrapped data. Experiments on code generation tasks demonstrate the effectiveness of our approach. Across multiple base models, RA3 improves the average performance on HumanEval and MBPP by 8 and 4 points over the base model and the next-token prediction baseline. Furthermore, RA3 achieves faster convergence and higher asymptotic performance in RLVR on HumanEval+, MBPP+, LiveCodeBench, and Codeforces.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

arXiv.org Machine Learning

2509.2581

Country:

North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Middle East > Iraq > Basra Governorate > Basra (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.89)

Add feedback

Analogy making as amortised model construction

Nagy, David G., Shen, Tingke, Zhou, Hanqi, Wu, Charley M., Dayan, Peter

arXiv.org Artificial IntelligenceJul-23-2025

Humans flexibly construct internal models to navigate novel situations. To be useful, these internal models must be sufficiently faithful to the environment that resource-limited planning leads to adequate outcomes; equally, they must be tractable to construct in the first place. We argue that analogy plays a central role in these processes, enabling agents to reuse solution-relevant structure from past experiences and amortise the computational costs of both model construction (construal) and planning. Formalis-ing analogies as partial homomorphisms between Markov decision processes, we sketch a framework in which abstract modules, derived from previous construals, serve as com-posable building blocks for new ones. This modular reuse allows for flexible adaptation of policies and representations across domains with shared structural essence.

analogy, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2507.16511

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Europe > Germany > Hesse > Darmstadt Region > Darmstadt (0.04)
(4 more...)

Genre: Research Report (0.40)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)

Add feedback

Efficient Monte Carlo Tree Search via On-the-Fly State-Conditioned Action Abstraction

Kwak, Yunhyeok, Hwang, Inwoo, Kim, Dooyoung, Lee, Sanghack, Zhang, Byoung-Tak

arXiv.org Artificial IntelligenceJun-2-2024

Monte Carlo Tree Search (MCTS) has showcased its efficacy across a broad spectrum of decision-making problems. However, its performance often degrades under vast combinatorial action space, especially where an action is composed of multiple sub-actions. In this work, we propose an action abstraction based on the compositional structure between a state and sub-actions for improving the efficiency of MCTS under a factored action space. Our method learns a latent dynamics model with an auxiliary network that captures sub-actions relevant to the transition on the current state, which we call state-conditioned action abstraction. Notably, it infers such compositional relationships from high-dimensional observations without the known environment model. During the tree traversal, our method constructs the state-conditioned action abstraction for each node on-the-fly, reducing the search space by discarding the exploration of redundant sub-actions. Experimental results demonstrate the superior sample efficiency of our method compared to vanilla MuZero, which suffers from expansive action space.

action space, artificial intelligence, planning & scheduling, (15 more...)

arXiv.org Artificial Intelligence

2406.00614

Country:

Asia > South Korea > Seoul > Seoul (0.04)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Italy > Lazio > Rome (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Leisure & Entertainment > Games (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.85)

Add feedback

Learning Planning Abstractions from Language

Liu, Weiyu, Chen, Geng, Hsu, Joy, Mao, Jiayuan, Wu, Jiajun

arXiv.org Artificial IntelligenceMay-6-2024

This paper presents a framework for learning state and action abstractions in sequential decision-making domains. Our framework, planning abstraction from language (PARL), utilizes language-annotated demonstrations to automatically discover a symbolic and abstract action space and induce a latent state abstraction based on it. PARL consists of three stages: 1) recovering object-level and action concepts, 2) learning state abstractions, abstract action feasibility, and transition models, and 3) applying low-level policies for abstract actions. During inference, given the task description, PARL first makes abstract action plans using the latent transition and feasibility functions, then refines the high-level plan using low-level policies. PARL generalizes across scenarios involving novel object instances and environments, unseen concept compositions, and tasks that require longer planning horizons than settings it is trained on.

machine learning, natural language, reinforcement learning, (22 more...)

arXiv.org Artificial Intelligence

2405.03864

Country: North America > United States > Oregon (0.04)

Genre:

Workflow (0.68)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback

RL-CFR: Improving Action Abstraction for Imperfect Information Extensive-Form Games with Reinforcement Learning

Li, Boning, Fang, Zhixuan, Huang, Longbo

arXiv.org Artificial IntelligenceMar-7-2024

Effective action abstraction is crucial in tackling challenges associated with large action spaces in Imperfect Information Extensive-Form Games (IIEFGs). However, due to the vast state space and computational complexity in IIEFGs, existing methods often rely on fixed abstractions, resulting in sub-optimal performance. In response, we introduce RL-CFR, a novel reinforcement learning (RL) approach for dynamic action abstraction. RL-CFR builds upon our innovative Markov Decision Process (MDP) formulation, with states corresponding to public information and actions represented as feature vectors indicating specific action abstractions. The reward is defined as the expected payoff difference between the selected and default action abstractions. RL-CFR constructs a game tree with RL-guided action abstractions and utilizes counterfactual regret minimization (CFR) for strategy derivation. Impressively, it can be trained from scratch, achieving higher expected payoff without increased CFR solving time. In experiments on Heads-up No-limit Texas Hold'em, RL-CFR outperforms ReBeL's replication and Slumbot, demonstrating significant win-rate margins of $64\pm 11$ and $84\pm 17$ mbb/hand, respectively.

action abstraction, machine learning, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2403.04344

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
(28 more...)

Genre: Research Report > New Finding (0.93)

Industry: Leisure & Entertainment > Games > Poker (0.49)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(2 more...)

Add feedback

Learning Augmented, Multi-Robot Long-Horizon Navigation in Partially Mapped Environments

Khanal, Abhish, Stein, Gregory J.

arXiv.org Artificial IntelligenceMar-29-2023

We present a novel approach for efficient and reliable goal-directed long-horizon navigation for a multi-robot team in a structured, unknown environment by predicting statistics of unknown space. Building on recent work in learning-augmented model based planning under uncertainty, we introduce a high-level state and action abstraction that lets us approximate the challenging Dec-POMDP into a tractable stochastic MDP. Our Multi-Robot Learning over Subgoals Planner (MR-LSP) guides agents towards coordinated exploration of regions more likely to reach the unseen goal. We demonstrate improvement in cost against other multi-robot strategies; in simulated office-like environments, we show that our approach saves 13.29% (2 robot) and 4.6% (3 robot) average cost versus standard non-learned optimistic planning and a learning-informed baseline.

artificial intelligence, machine learning, robot, (16 more...)

arXiv.org Artificial Intelligence

2303.16654

Country: North America > United States > Virginia > Fairfax County > Fairfax (0.04)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)

Add feedback

Asymmetric Action Abstractions for Planning in Real-Time Strategy Games

Moraes, Rubens O. | Nascimento, Mario A. | Lelis, Levi H.S. (a:1:{s:5:"en_US";s:21:"University of Alberta";})

Journal of Artificial Intelligence ResearchNov-30-2022

Action abstractions restrict the number of legal actions available for real-time planning in zero-sum extensive-form games, thus allowing algorithms to focus their search on a set of promising actions. Even though unabstracted game trees can lead to optimal policies, due to real-time constraints and the tree size, they are not a practical choice. In this context, we introduce an action abstraction scheme which we call asymmetric action abstraction. Asymmetric abstractions allow search algorithms to "pay more attention" to some aspects of the game by unevenly dividing the algorithm's search effort amongst different aspects of the game. We also introduce four algorithms that search in asymmetrically abstracted game trees to evaluate the effectiveness of our abstraction schemes. Two of our algorithms are adaptations of algorithms developed for searching in action-abstracted spaces, Portfolio Greedy Search and Stratified Strategy Selection, and the other two are adaptations of an algorithm developed for searching in unabstracted spaces, NaïveMCTS. An extensive set of experiments in a real-time strategy game shows that search algorithms using asymmetric abstractions are able to outperform all other search algorithms tested.

artificial intelligence, machine learning, real time system, (16 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.13769

AI Access Foundation

13769

Journal of Artificial Intelligence Research

Country:

North America > Canada > Alberta (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
South America > Brazil (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Leisure & Entertainment > Games > Computer Games (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Architecture > Real Time Systems (1.00)

Add feedback

Moraes

AAAI ConferencesFeb-8-2022, 09:28:33 GMT

Search algorithms based on combinatorial multi-armed bandits (CMABs) are promising for dealing with state-space sequential decision problems. However, current CMAB-based algorithms do not scale to problem domains with very large actions spaces, such as real-time strategy games played in large maps. In this paper we introduce CMAB-based search algorithms that use action abstraction schemes to reduce the action space considered during search. One of the approaches we introduce use regular action abstractions (A1N), while the other two use asymmetric action abstractions (A2N and A3N). Empirical results on MicroRTS show that A1N, A2N, and A3N are able to outperform an existing CMAB-based algorithm in matches played in large maps, and A3N is able to outperform all state-of-the-art search algorithms tested.

algorithm, artificial intelligence, cmab-based algorithm, (5 more...)

AAAI Conferences

Industry: Leisure & Entertainment > Games (0.69)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)

Add feedback

Action Abstractions for Combinatorial Multi-Armed Bandit Tree Search

Moraes, Rubens O. (Universidade Federal de Viçosa) | Mariño, Julian R. H. (Universidade de São Paulo) | Lelis, Levi H. S. (Universidade Federal de Viçosa) | Nascimento, Mario A. (University of Alberta)

AAAI ConferencesNov-7-2018

action abstraction, artificial intelligence, combinatorial multi-armed bandit tree search

AAAI Conferences

Fourteenth Artificial Intelligence and Interactive Digital Entertainment Conference

Industry: Leisure & Entertainment > Games (0.53)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)

Add feedback

Depth-Limited Solving for Imperfect-Information Games

Brown, Noam, Sandholm, Tuomas, Amos, Brandon

arXiv.org Artificial IntelligenceMay-21-2018

A fundamental challenge in imperfect-information games is that states do not have well-defined values. As a result, depth-limited search algorithms used in single-agent settings and perfect-information games do not apply. This paper introduces a principled way to conduct depth-limited solving in imperfect-information games by allowing the opponent to choose among a number of strategies for the remainder of the game at the depth limit. Each one of these strategies results in a different set of values for leaf nodes. This forces an agent to be robust to the different strategies an opponent may employ. We demonstrate the effectiveness of this approach by building a master-level heads-up no-limit Texas hold'em poker AI that defeats two prior top agents using only a 4-core CPU and 16 GB of memory. Developing such a powerful agent would have previously required a supercomputer.

artificial intelligence, machine learning, subgame, (16 more...)

arXiv.org Artificial Intelligence

1805.08195

Country:

North America > United States > Texas (0.25)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (1.00)

Industry: Leisure & Entertainment > Games > Poker (0.67)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback